feat: Phase 3 executor with SSE streaming#6

Merged
prosdev merged 6 commits into main from feat/phase-3-executor-sse on Mar 14, 2026
@prosdev prosdev commented Mar 14, 2026

Summary

  • Add graph executor with SSE streaming, run lifecycle management, and human input pause/resume
  • Add run API routes: POST /graphs/{id}/run, GET /runs/{id}/stream, POST /runs/{id}/resume, GET /runs/{id}/status
  • Add checkpointer parameter to build_graph for interrupt/resume support
  • Split code-reviewer agent into 3 specialized parallel agents (security, logic, quality)
  • Move manual test scripts from scripts/ to tests/manual/

Test plan

All 18 manual tests pass (bash tests/manual/run_all.sh):

Phase 2 builder (existing, moved to tests/manual/)

  • test_01 — Linear graph (FakeListChatModel)
  • test_02 — Real Gemini LLM integration
  • test_03 — Branching with field_equals condition
  • test_04 — Tool node + tool_error condition routing
  • test_05 — Human input interrupt & resume
  • test_06 — Full pipeline (tool + LLM + condition)

Phase 3 executor + SSE (new)

  • test_07 — SSE event sequence: run_started → node_started → node_completed → edge_traversed → graph_completed
  • test_08 — State snapshots in node_completed events evolve across nodes
  • test_09 — Run status transitions (running → completed), duration_ms, final_state
  • test_10 — Human input pause/resume via executor: graph_paused event, submit_resume, completion
  • test_11 — SSE reconnection: replay buffer stores sequential IDs, Last-Event-ID skips seen events
  • test_12 — Concurrent run limit: MAX_RUNS_PER_KEY enforced, different owners independent
  • test_13 — Run timeout: RUN_TIMEOUT_SECONDS triggers error event, loop terminates
  • test_14 — Condition routing SSE: edge_traversed shows correct condition_result per branch
  • test_15 — Tool error routing SSE: tool_error routes success→END, error→llm_err with deferred edge
  • test_16 — Keepalive during pause: no id field, excluded from replay buffer
  • test_17 — DB fallback: format_sse produces correct terminal events for completed/lost runs
  • test_18 — Cancel run: cancel_event terminates execution, error event emitted
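The reconnection and keepalive behaviour exercised by test_11 and test_16 can be sketched roughly as follows. This is a minimal illustration, not the executor's actual code: `ReplayBuffer`, `format_sse_frame`, and the `keepalive` event name are hypothetical names chosen for the example, and the real `format_sse` signature may differ.

```python
import json
from typing import Optional


class ReplayBuffer:
    """Hypothetical in-memory buffer that assigns sequential event IDs."""

    def __init__(self) -> None:
        self.events: list = []
        self._next_id = 1

    def append(self, event: str, data: dict) -> dict:
        # Every buffered event gets the next sequential ID.
        record = {"id": self._next_id, "event": event, "data": data}
        self._next_id += 1
        self.events.append(record)
        return record

    def replay_after(self, last_event_id: Optional[int]) -> list:
        # A reconnecting client sends Last-Event-ID; skip what it already saw.
        if last_event_id is None:
            return list(self.events)
        return [e for e in self.events if e["id"] > last_event_id]


def format_sse_frame(event: str, data: dict, event_id: Optional[int] = None) -> str:
    """Serialize one SSE frame; the id line is omitted when event_id is None."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


buf = ReplayBuffer()
for name in ("run_started", "node_started", "node_completed"):
    buf.append(name, {})

# Keepalives are formatted without an id and never appended to the buffer,
# so they cannot shift the IDs a client resumes from.
keepalive = format_sse_frame("keepalive", {})
missed = buf.replay_after(1)  # client last saw event 1
```

Keeping keepalives out of the buffer means `Last-Event-ID` always refers to a real graph event, which is what makes the replay duplicate-free.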

Notable findings during testing

  • Deferred condition edges don't emit edge_traversed when routing to END (not a real node) — test_15 verifies via completed node list instead
  • Cancel during pause requires both cancel_event + resume_event to unblock _wait_for_resume — test_18 documents this
  • SSE stream_run_sse queue is single-reader; reconnection test_11 validates replay buffer directly
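The cancel-during-pause finding can be reproduced in isolation. The sketch below assumes the waiter blocks only on the resume event and checks the cancel flag after waking, which is one plausible reading of the finding; `wait_for_resume` and `cancel_run` here are illustrative stand-ins, not the executor's `_wait_for_resume`.

```python
import asyncio


async def wait_for_resume(resume_event: asyncio.Event, cancel_event: asyncio.Event) -> str:
    # The waiter blocks on resume_event only, so setting cancel_event
    # alone would leave it parked here.
    await resume_event.wait()
    if cancel_event.is_set():
        return "cancelled"
    return "resumed"


def cancel_run(resume_event: asyncio.Event, cancel_event: asyncio.Event) -> None:
    # Set the cancel flag first, then wake the waiter so it can observe it.
    cancel_event.set()
    resume_event.set()


async def demo() -> str:
    resume_event, cancel_event = asyncio.Event(), asyncio.Event()
    waiter = asyncio.create_task(wait_for_resume(resume_event, cancel_event))
    await asyncio.sleep(0)  # give the waiter a chance to start and block
    cancel_run(resume_event, cancel_event)
    return await waiter
```

An alternative design would race the two events with `asyncio.wait(..., return_when=asyncio.FIRST_COMPLETED)`, which removes the need to set both.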

🤖 Generated with Claude Code

prosdev and others added 6 commits March 14, 2026 03:50
Allow callers to provide an explicit checkpointer for graph compilation.
The executor uses this to enable state snapshots on all graphs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add RunManager for tracking active runs with per-key and global limits.
Execute graphs via astream with state snapshots after each node.
Sequential event IDs for duplicate-free SSE reconnection replay.
Emit node_started before node_completed for each node.
Derive condition_result in edge_traversed from schema branches.
Human-in-the-loop resume with buffered replay (no SSE-listener wait).
Run timeout (5min default) and cancellation via asyncio.Event.
Safe DB updates in exception handlers via _safe_update_run.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
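As a rough illustration of the per-key and global limits described in this commit, a run registry might look like the sketch below. `SimpleRunManager`, `RunLimitExceeded`, and the limit values are assumptions made for the example; the real `RunManager` and the value of `MAX_RUNS_PER_KEY` are not shown in this PR description.

```python
class RunLimitExceeded(Exception):
    """Raised when a new run would exceed a per-key or global limit."""


class SimpleRunManager:
    """Hypothetical registry of active runs, keyed by run ID."""

    def __init__(self, max_runs_per_key: int = 2, max_runs_global: int = 10) -> None:
        # Limit values are placeholders, not the project's defaults.
        self.max_runs_per_key = max_runs_per_key
        self.max_runs_global = max_runs_global
        self._runs: dict = {}  # run_id -> owner key

    def register(self, run_id: str, owner_key: str) -> None:
        if len(self._runs) >= self.max_runs_global:
            raise RunLimitExceeded("global run limit reached")
        owned = sum(1 for key in self._runs.values() if key == owner_key)
        if owned >= self.max_runs_per_key:
            raise RunLimitExceeded(f"run limit reached for key {owner_key!r}")
        self._runs[run_id] = owner_key

    def release(self, run_id: str) -> None:
        # Called when a run completes, fails, times out, or is cancelled.
        self._runs.pop(run_id, None)


manager = SimpleRunManager(max_runs_per_key=2)
manager.register("run-1", "alice")
manager.register("run-2", "alice")
manager.register("run-3", "bob")  # a different owner is counted independently
```

A third run for `alice` would raise `RunLimitExceeded` until one of her runs is released.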
POST /v1/graphs/{id}/run starts execution and returns run_id.
GET /v1/runs/{id}/stream opens SSE with Last-Event-ID reconnection.
POST /v1/runs/{id}/resume accepts any JSON type as human input.
GET /v1/runs/{id}/status supports reconnection with DB fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
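On the client side, reconnecting to the stream route amounts to remembering the last `id` seen and sending it back in a `Last-Event-ID` header. The parser below is a simplified sketch for illustration: `parse_sse` and `last_event_id` are hypothetical helpers, and multi-line `data` fields from the full SSE format are not handled.

```python
from typing import Optional


def parse_sse(stream_text: str) -> list:
    """Split raw SSE text into frames of field name -> value."""
    frames = []
    for block in stream_text.split("\n\n"):
        frame = {}
        for line in block.split("\n"):
            if not line or line.startswith(":"):
                continue  # skip blank lines and SSE comment lines
            name, _, value = line.partition(":")
            # A single leading space after the colon is not part of the value.
            frame[name] = value[1:] if value.startswith(" ") else value
        if frame:
            frames.append(frame)
    return frames


def last_event_id(frames: list) -> Optional[str]:
    """Return the most recent id seen, ignoring frames without one."""
    ids = [f["id"] for f in frames if "id" in f]
    return ids[-1] if ids else None


raw = (
    "id: 1\nevent: run_started\ndata: {}\n\n"
    "event: keepalive\ndata: {}\n\n"
    'id: 2\nevent: node_started\ndata: {"node": "a"}\n\n'
)
frames = parse_sse(raw)
reconnect_headers = {"Last-Event-ID": last_event_id(frames)}
```

Frames without an `id`, such as keepalives, leave the remembered ID unchanged, so a reconnect resumes after the last real event.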
- Add paused_node_id/paused_prompt fields to RunContext instead of
  fragile ctx.events[-1] access in status endpoint
- Add RunManager.cancel_all() to avoid accessing private _runs in shutdown
- Document db lifetime intent in start_run route comment

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace monolithic code-reviewer with 3 focused agents:
- security-reviewer (opus): auth, ownership, secrets, SSRF — CRITICAL/WARNING only
- logic-reviewer (opus): correctness, edge cases, race conditions — with confidence levels
- quality-reviewer (sonnet): tests, conventions, readability — capped at 5 suggestions

code-reviewer.md becomes an orchestrator that launches all 3 in parallel.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add 12 manual test scripts (07-18) covering Phase 3 executor and SSE
streaming features. Move all manual tests from scripts/ to
tests/manual/ for better organization.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@prosdev prosdev merged commit 18c01ca into main Mar 14, 2026
4 checks passed